Fast and space efficient PNN algorithm with delayed distance calculations

Authors

  • Timo Kaukoranta
  • Pasi Fränti
  • Olli Nevalainen
Abstract

Clustering of a data set can be done with the well-known Pairwise Nearest Neighbor (PNN) algorithm. The algorithm is conceptually very simple and gives high-quality solutions. A drawback of the method is the relatively long running time of the original (exact) implementation. Recently, an efficient version of the exact PNN algorithm has been introduced in the literature. In this paper we give a faster implementation of this algorithm. The idea is to postpone updating the nearest neighbor information in order to reduce the number of cluster distance calculations. Correctness of the algorithm follows from the monotonicity of the cluster distances. Practical tests show that the new organization of the algorithm decreases the running time of PNN by about 35 percent.
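
The postponed-update idea can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the standard PNN merge cost, d(a, b) = n_a n_b / (n_a + n_b) · ||c_a − c_b||², and the names `merge_cost` and `lazy_pnn` are hypothetical. When a cluster's nearest neighbor is consumed by a merge, the sketch only marks the cluster as stale and keeps its old distance, which by monotonicity remains a lower bound on the true value. The neighbor is recomputed only if the stale cluster comes up as the candidate for the next merge.

```python
import math


def merge_cost(ca, na, cb, nb):
    """Standard PNN merge cost between two clusters (assumed formulation)."""
    sq = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return na * nb / (na + nb) * sq


def lazy_pnn(points, m):
    """Merge singleton clusters until m remain; returns [(centroid, size), ...]."""
    if m < 1 or not points:
        raise ValueError("need at least one point and m >= 1")
    cent = {i: tuple(float(x) for x in p) for i, p in enumerate(points)}
    size = {i: 1 for i in cent}

    def find_nn(i):
        # Full O(N) search for the nearest neighbor of cluster i.
        best_d, best_j = math.inf, None
        for j in cent:
            if j != i:
                d = merge_cost(cent[i], size[i], cent[j], size[j])
                if d < best_d:
                    best_d, best_j = d, j
        return best_d, best_j

    nn_dist, nn_idx, fresh = {}, {}, {}
    for i in cent:
        nn_dist[i], nn_idx[i] = find_nn(i)
        fresh[i] = True

    while len(cent) > m:
        # Pick the smallest stored distance; refresh it first if it is stale.
        while True:
            a = min(cent, key=lambda i: nn_dist[i])
            if fresh[a]:
                break
            nn_dist[a], nn_idx[a] = find_nn(a)
            fresh[a] = True
        b = nn_idx[a]

        # Merge b into a: weighted centroid, summed size.
        na, nb = size[a], size[b]
        cent[a] = tuple((na * x + nb * y) / (na + nb)
                        for x, y in zip(cent[a], cent[b]))
        size[a] = na + nb
        del cent[b], size[b], nn_dist[b], nn_idx[b], fresh[b]

        # Simplification: the merged cluster is refreshed immediately.
        nn_dist[a], nn_idx[a] = find_nn(a)
        fresh[a] = True

        # Delayed update: clusters that pointed at a or b are only marked
        # stale; their old distance is kept as a lower bound.
        for i in cent:
            if i != a and nn_idx[i] in (a, b):
                fresh[i] = False

    return [(cent[i], size[i]) for i in sorted(cent)]
```

For example, `lazy_pnn([(0, 0), (0, 1), (10, 0), (10, 1)], 2)` merges the two left points and the two right points. A stale cluster that never reaches the front of the selection is never recomputed, which is where the savings in distance calculations would come from.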

Similar resources

Fast PNN-based Clustering Using K-nearest Neighbor Graph

The search for the nearest neighbor is the main source of computation in most clustering algorithms. We propose the use of a nearest neighbor graph for reducing the number of candidates. The number of distance calculations per search can be reduced from O(N) to O(k), where N is the number of clusters and k is the number of neighbors in the graph. We apply the proposed scheme within agglomerative clusteri...

Cascade Rectangle Algorithm and Transfer Matrix in Shortest-Path Networks with Cycles

The shortest path problem is among the most interesting problems in graph and network theory. The literature contains many efficient matrix-based algorithms for finding the shortest path and distance between all pairs of nodes. In this paper, a new exact algorithm, named the Cascade Rectangle Algorithm, is presented by using the main structure of previous exact algorithms and developin...

Calculation of One-dimensional Forward Modelling of Helicopter-borne Electromagnetic Data and a Sensitivity Matrix Using Fast Hankel Transforms

The helicopter-borne electromagnetic (HEM) frequency-domain exploration method is an airborne electromagnetic (AEM) technique that is widely used for vast and rough areas for resistivity imaging. The vast amount of digitized data flowing from the HEM method requires an efficient and accurate inversion algorithm. Generally, the inverse modelling of HEM data in the first step requires a precise a...

gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences

Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...

Online Streaming Feature Selection Using Geometric Series of the Adjacency Matrix of Features

Feature Selection (FS) is an important pre-processing step in machine learning and data mining. All the traditional feature selection methods assume that the entire feature space is available from the beginning. However, online streaming features (OSF) are an integral part of many real-world applications. In OSF, the number of training examples is fixed while the number of features grows with t...

Publication date: 1998